Scalability Analysis of Declustering Methods for Cartesian Product Files
نویسندگان
چکیده
Efficient storage and retrieval of multi-attribute datasets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files over multiple disks to obtain high performance for disk accesses. Though the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper we derive formulas describing the scalability of two popular declustering methods Disk Modulo and Fieldwise Xor for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods and are corroborated by extensive simulation experiments. From the practical point of view, the formulas given in this paper provide a simple measure which can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
منابع مشابه
Scalability Analysis of Declustering Methods for Multidimensional Range Queries
Efficient storage and retrieval of multiattribute data sets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multiattribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files across multiple disks to obtain high perf...
متن کاملIndependent Study Report: Propose Partial Replication Schemes for Replicated Declustering and Compare Their Performance
In this course, under the supervision of our professor, we have focused on the techniques for declustering data into multiple disks. Our aim is to get familiar with all the related work done till today so we have started from the very beginning of declustering which is done on relational databases and Cartesian product files [4, 5]. We have explored the literature and find out the most outstand...
متن کاملStudy of Scalable Declustering Algorithms for Parallel Grid Files
Efficient storage and retrieval of large multidimensional datasets is an important concern for large-scale scientific computations such as long-running time-dependent simulations which periodically generate snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known acce...
متن کاملDeclustering Using Fractals
We propose a method to achieve declustering for cartesian product les on M units. The focus is on range queries, as opposed to partial match queries that older declustering methods have examined. Our method uses a distance-preserving mapping, namely, the Hilbert curve, to impose a linear ordering on the multidimensional points (buckets); then, it traverses the buckets according to this ordering...
متن کاملAn Optimal Disk Allocation Strategy for Partial Match Queries on Non-Uniform Cartesian Product Files
The disk allocation problem addresses the issue of how to distribute a file on to several disks to maximize the concurrent disk accesses in response to a partial match query. In the past this problem has been studied for binary as well as for p-ary cartesian product files. In this paper, we propose a disk allocation strategy for non-uniform cartesian product files by a coding theoretic approach...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1996